Retrieval Status Values in Information Retrieval Evaluation

Authors

  • Amélie Imafouo
  • Xavier Tannier
Abstract

Retrieval systems rank documents according to their retrieval status values (RSVs), provided that these increase monotonically with the probability of relevance of documents. In this work, we investigate the links between RSVs and IR system evaluation.

1 IR evaluation and relevance

Kagolovsk et al. [1] carried out a detailed survey of the main IR work on evaluation. Relevance has always been the central concept in IR evaluation, and many works have studied it. Saracevic [2] proposed a framework for classifying the various notions of relevance. Other works proposed definitions and formalizations of relevance. All these works and many others suggest that there is no single notion of relevance: relevance is a complex social and cognitive phenomenon [3]. Because collections keep growing, relevance judgements cannot be complete, and techniques such as pooling are used to collect a set of documents to be judged by human assessors. Some works investigated this technique, its limits and possible improvements [4].

To evaluate and compare IR systems, several measures have been proposed. Most of them are based on the ranking of the documents retrieved by these systems, and that ranking follows monotonically decreasing RSVs. Precision and recall are the two most frequently used measures, but others have been proposed (the Probability of Relevance, the Expected Precision, the E-measure, the Expected Search Length, etc.). Korfhage [5] suggested comparing an IR system (IRS) with a so-called ideal IRS (normalized recall and normalized precision). Several user-oriented measures have also been proposed (coverage ratio, novelty ratio, satisfaction, frustration).

2 IR evaluation measures and RSV

2.1 Previous use of RSVs

Document ranking is based on the RSV given to each document by the IRS.
Each IRS computes document RSVs in its own way, according to the IR model on which it is based (0 or 1 for the Boolean model, [0, 1] for fuzzy retrieval, [0, 1] or $\mathbb{R}$ for the vector-space model, etc.). Little effort has been spent on analyzing the relationship between RSVs and the probability of relevance of documents. Nottelman et al. [6] describe this relationship by a "normalization" function that maps the RSV onto the probability of relevance (linear and logistic mapping functions). Lee [7] used a min-max normalization of RSVs and combined different runs using the numerical mean of the RSVs of each run. Kamps et al. [8] and Jijkoun et al. [9] also used normalized RSVs to combine different kinds of runs.

2.2 Proposed Measures

We will use the following notation in the rest of this paper: $d_i$ is the document retrieved at rank $i$ by the system; $s_i(t)$ is, for a given topic $t$, the RSV that the system gives to document $d_i$; and $n$ is the number of documents considered when evaluating the system. We assume that all scores are positive. Retrieved documents are ranked by their RSV, and each document is given a binary relevance judgement (0 or 1).

RSVs are generally considered meaningless system values. Yet we conjecture that they carry stronger and more interesting semantics than the simple rank of the document. Indeed, two documents with close RSVs are supposed to have close probabilities of relevance. In the same way, two distant scores suggest a strong difference in the probability of relevance, even if the documents have consecutive or close ranks. However, the RSV scale depends on the IRS model and implementation, and different RSV scales should not affect the evaluation. Nevertheless, the relative distances between RSVs attributed by the same system are very significant. To abstract away from the absolute differences between systems, we use a maximum normalization: for a topic $t$, $\forall i,\ s'_i(t) = \frac{s_i(t)}{s_1(t)}$.
Thus, $\forall i,\ s'_i(t) \in [0, 1]$ and $s'_i(t) \geq s'_{i+1}(t)$. The value $s'_i(t)$ is the system's estimate of the relative closeness of document $d_i$ to the document that the system considers the most relevant ($d_1$) for topic $t$. For $d_1$, $s'_1 = 1$. We set $s_i = 0$ and $s'_i = 0$ for any non-retrieved document. We assume that a lower bound exists for the RSV and is equal to 0. If this is not the case, we need to know (or compute) a lower bound and perform a min-max normalization.

We propose a first pair of metrics, applicable to each topic: $r$ measures a success rate while $e$ measures a failure rate ($p_i$ is the binary assessed relevance of document $d_i$).
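
As an illustration of the normalization step described above (not of the metrics r and e), here is a minimal Python sketch. The function name `normalize_rsvs` and its `lower_bound` parameter are hypothetical names introduced for this example and do not come from the paper. The sketch assumes that the scores of one topic are supplied in rank order, highest first. With the default lower bound of 0 it reduces to the maximum normalization (each score divided by the top score); with another known lower bound it performs a min-max normalization between that bound and the top score.

```python
from typing import List


def normalize_rsvs(scores: List[float], lower_bound: float = 0.0) -> List[float]:
    """Normalize the RSVs of one topic, given in rank order (highest first).

    Hypothetical helper for illustration only.
    lower_bound = 0 gives the maximum normalization s_i / s_1;
    any other value gives a min-max normalization between that
    bound and the top score s_1.
    """
    if not scores:
        return []

    # The ranking is assumed to follow non-increasing RSVs.
    if any(a < b for a, b in zip(scores, scores[1:])):
        raise ValueError("scores must be sorted in non-increasing order")

    top = scores[0]
    if scores[-1] < lower_bound:
        raise ValueError("a score lies below the stated lower bound")
    if top <= lower_bound:
        raise ValueError("top score must be strictly above the lower bound")

    span = top - lower_bound
    return [(s - lower_bound) / span for s in scores]


# Maximum normalization (lower bound 0): the top document gets 1.
print(normalize_rsvs([8.0, 4.0, 2.0]))  # [1.0, 0.5, 0.25]

# Min-max normalization when scores can be negative and the bound is known.
print(normalize_rsvs([3.0, 1.0, -1.0], lower_bound=-1.0))  # [1.0, 0.5, 0.0]
```

Under this sketch, a non-retrieved document would simply be assigned a normalized score of 0, in line with the convention stated above.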





Publication date: 2005